ATGGGTGCGAGAGCGTCAGTATTAAGCGGGGGAGAATTAGATCGATGGGAAAAAAT

TCGGTTAAGGCCAGGGGGAAAGAAGAAGTACAAGCTAAAGCACATCGTATGGGCAA 

GCAGGGAGCTAGAACGATTCGCAGTTAATCCTGGCCTGTTAGAAACATCAGAAGGC 

TGTAGACAAATACTGGGACAGCTACAACCATCCCTTCAGACAGGATCAGAGGAGCT 

TCGATCACTATACAACACAGTAGCAACCCTCTATTGTGTGCACCAGCGGATCGAGA 

TCAAGGACACCAAGGAAGCTTTAGACAAGATAGAGGAAGAGCAAAACAAGTCCAAG 

AAGAAGGCCCAGCAGGCAGCAGCTGACACAGGACACAGCAATCAGGTCAGCCAAAA

TTACCCTATAGTGCAGAACATCCAGGGGCAAATGGTACATCAGGCCATATCACCTA 

GAACTTTAAATGCATGGGTAAAAGTAGTAGAAGAGAAGGCTTTCAGCCCAGAAGTG 

ATACCCATGTTTTCAGCATTATCAGAAGGAGCCACCCCACAGGACCTGAACACGAT 

GTTGAACACCGTGGGGGGACATCAAGCAGCCATGCAAATGTTAAAAGAGACCATCA 

ATGAGGAAGCTGCAGAATGGGATAGAGTGCATCCAGTGCATGCAGGGCCTATTGCA 

CCAGGCCAGATGAGAGAACCAAGGGGAAGTGACATAGCAGGAACTACTAGTACCCT 

TCAGGAACAAATAGGATGGATGACAAATAATCCACCTATCCCAGTAGGAGAGATCT 

ACAAGAGGTGGATAATCCTGGGATTGAACAAGATCGTGAGGATGTATAGCCCTACC 

AGCATTCTGGACATAAGACAAGGACCAAAGGAACCCTTTAGAGACTATGTAGACCG 

GTTCTATAAAACTCTAAGAGCTGAGCAAGCTTCACAGGAGGTAAAAAATTGGATGA 

CAGAAACCTTGTTGGTCCAAAATGCGAACCCAGATTGTAAGACCATCCTGAAGGCT 

CTCGGCCCAGCGGCTACACTAGAAGAAATGATGACAGCATGTCAGGGAGTAGGAGG 

ACCCGGCCATAAGGCAAGAGTTTTGGCCGAGGCGATGAGCCAGGTGACGAACTCGG 

CGACCATAATGATGCAGAGAGGCAACTTCCGGAACCAGCGGAAGATCGTCAAGTGC 

TTCAATTGTGGCAAAGAAGGGCACACCGCCAGGAACTGCCGGGCCCCCCGGAAGAA 

GGGCTGTTGGAAATGTGGAAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGAC 
AGGCTAATTTTTTAGGGAAGATCTGGCCTTCCTACAAGGGAAGGCCAGGGAATTTT

CTTCAGAGCAGACCAGAGCCAACAGCCCCACCAGAAGAGAGCTTCAGGTCTGGGGT 

AGAGACAACAACTCCCCCTCAGAAGCAGGAGCCGATAGACAAGGAACTGTATCCTT 

TAACTTCCCTCAGATCACTCTTTGGCAACGACCCCTCGTCACAGTAAGGATCGGGG 

GGCAACTCAAGGAAGCGCTGCTCGATACAGGAGCAGATGATACAGTATTAGAAGAA 

ATGAGTTTGCCAGGAAGATGGAAACCAAAAATGATAGGGGGGATCGGGGGCTTCAT 

CAAGGTGAGGCAGTACGACCAGATACTCATAGAAATCTGTGGACATAAAGCTATAG

GTACAGTATTAGTAGGACCTACACCTGTCAACATAATTGGAAGAAATCTGTTGACC 

CAGATCGGCTGCACCTTGAACTTCCCCATCAGCCCTATTGAGACGGTGCCCGTGAA

GTTGAAGCCGGGGATGGACGGCCCCAAGGTCAAGCAATGGCCATTGACGAAAGAGA 

AGATCAAGGCCTTAGTCGAAATCTGTACAGAGATGGAGAAGGAAGGGAAGATCAGC 

AAGATCGGGCCTGAGAACCCCTACAACACTCCAGTCTTCGCAATCAAGAAGAAGGA 

CAGTACCAAGTGGAGAAAGCTGGTGGACTTCAGAGAGCTGAACAAGAGAACTCAGG 

ACTTCTGGGAAGTTCAGCTGGGCATCCCACATCCCGCTGGGTTGAAGAAGAAGAAG 

TCAGTGACAGTGCTGGATGTGGGTGATGCCTACTTCTCCGTTCGCTTGGACGAGGA 

CTTCAGGAAGTACACTGCCTTCACGATACCTAGCATCAACAACGAGACACCAGGCA 

TCCGCTACCAGTACAACGTGCTGCCACAGGGATGGAAGGGATCACCAGCCATCTTT 

CAAAGCAGCATGACCAAGATCCTGGAGCCCTTCCGCAAGCAAAACCCAGACATCGT 

GATCTATCAGTACATGGACGACCTCTACGTAGGAAGTGACCTGGAGATCGGGCAGC 

ACAGGACCAAGATCGAGGAGCTGAGACAGCATCTGTTGAGGTGGGGACTGACCACA 

CCAGACAAGAAGCACCAGAAGGAACCTCCCTTCCTGTGGATGGGCTACGAACTGCA 

TCCTGACAAGTGGACAGTGCAGCCCATCGTGCTGCCTGAGAAGGACAGCTGGACTG 

TGAACGACATACAGAAGCTCGTGGGCAAGTTGAACTGGGCAAGCCAGATCTACCCA 

GGCATCAAAGTTAGGCAGCTGTGCAAGCTGCTTCGAGGAACCAAGGCACTGACAGA 
AGTGATCCCACTGACAGAGGAAGCAGAGCTAGAACTGGCAGAGAACCGAGAGATCC

TGAAGGAGCCAGTACATGGAGTGTACTACGACCCAAGCAAGGACCTGATCGCAGAG 

ATCCAGAAGCAGGGGCAAGGCCAATGGACCTACCAAATCTACCAGGAGCCCTTCAA 

GAACCTGAAGACAGGCAAGTACGCAAGGATGAGGGGTGCCCACACCAACGATGTGA 

AGCAGCTGACAGAGGCAGTGCAGAAGATCACCACAGAGAGCATCGTGATCTGGGGC 

AAGACTCCCAAGTTCAAGCTGCCCATACAGAAGGAGACATGGGAGACATGGTGGAC 

CGAGTACTGGCAAGCCACCTGGATCCCTGAGTGGGAGTTCGTGAACACCCCTCCCT 

TGGTGAAACTGTGGTATCAGCTGGAGAAGGAACCCATCGTGGGAGCAGAGACCTTC 

TACGTGGATGGGGCAGCCAACAGGGAGACCAAGCTGGGCAAGGCAGGCTACGTGAC 

CAACCGAGGACGACAGAAAGTGGTGACCCTGACTGACACCACCAACCAGAAGACTG 

AGCTGCAAGCCATCTACCTAGCTCTGCAAGACAGCGGACTGGAAGTGAACATCGTG 

ACAGACTCACAGTACGCACTGGGCATCATCCAAGCACAACCAGACCAATCCGAGTC 

AGAGCTGGTGAACCAGATCATCGAGCAGCTGATCAAGAAGGAGAAAGTGTACCTGG 

CATGGGTACCAGCACACAAAGGAATTGGAGGAAATGAACAAGTAGATAAATTAGTC 

AGTGCTGGGATCCGGAAGGTGCTGTTCCTGGACGGGATCGATAAGGCCCAAGATGA 

ACATGAGAAGTACCACTCCAACTGGCGCGCTATGGCCAGCGACTTCAACCTGCCAC

CTGTAGTAGCAAAAGAAATAGTAGCCAGCTGTGATAAATGTCAGCTAAAAGGAGAA 

GCCATGCATGGACAAGTAGACTGTAGTCCAGGAATATGGCAGCTGGACTGCACGCA 

CCTGGAGGGGAAGGTGATCCTGGTAGCAGTTCATGTAGCCAGTGGATATATAGAAG 

CAGAAGTTATCCCTGCTGAAACTGGGCAGGAAACAGCATATTTTCTTTTAAAATTA 

GCAGGAAGATGGCCAGTAAAAACAATACACACGGACAACGGAAGCAACTTCACTGG 

TGCTACGGTTAAGGCCGCCTGTTGGTGGGCGGGAATCAAGCAGGAATTTGGAATTC

CCTACAATCCCCAATCGCAAGGAGTCGTGGAGAGCATGAACAAGGAGCTGAAGAAG 

ATCATCGGACAAGTGAGGGATCAGGCTGAGCACCTGAAGACAGCAGTGCAGATGGC 



AGTGTTCATCCACAACTTCAAAAGAAAAGGGGGGATTGGGGGGTACAGTGCAGGGG 
AAAGGATCGTGGACATCATCGCCACCGACATCCAAACCAAGGAGCTGCAGAAGCAG 
ATCACCAAGATCCAGAACTTCCGGGTGTACTACCGCGACAGCCGCAACCCACTGTG 
GAAGGGACCAGCAAAGCTCCTCTGGAAGGGAGAGGGGGCAGTGGTGATCCAGGACA 
ACAGTGACATCAAAGTGGTGCCAAGGCGCAAGGCCAAGATCATCCGCGACTATGGA 
AAACAGATGGCAGGTGATGATTGTGTGGCAAGTAGACAGGATGAGGATTAGAACCT 
GGAAGAGCCTGGTGAAGCACCATATG (SEQUENCE ID NO:1)



S//.5 



>wildtype TGTACAGAGA TGGAAAAGGA AGGGAAAATT TCAAAAATTG
>mutated TGTACAGAGA TGGAGAAGGA AGGGAAGATC AGCAAGATCG
#1

>wildtype GGCCTGAAAA TCCATACAAT ACTCCAGTAT TTGCCATAAA
>mutated GGCCTGAGAA CCCCTACAAC ACTCCACTCT TCGCAATCAA
#41

>wildtype GAAAAAAGAC AGTACTAAAT GGAGAAAATT AGTAGATTTC
>mutated GAAGAAGGAC AGTACCAAGT GGAGAAAGCT GGTGGACTTC
#81



>wildtype AGAGAACTTA ATAAGAGAAC TCAAGACTTC TGGGAAGTTC
>mutated AGAGAGCTGA ACAAGAGAAC TCAGGACTTC TGGGAAGTTC
#121

>wildtype AATTAGGAAT ACCACATCCC GCAGGGTTAA AAAAGAAAAA
>mutated AGCTGGGCAT CCCACATCCC GCTGGGTTGA AGAAGAAGAA
#161

>wildtype ATCAGTAACA GTACTGGATG TGGGTGATGC ATATTTTTCA
>mutated GTCAGTGACA GTGCTGGATG TGGGTGATGC CTACTTCTCC
#201

>wildtype GTTCCCTTAG ATGAAGACTT CAGGAAATAT ACTGCATTTA
>mutated GTTCCCTTGG ACGAGGACTT CAGGAAGTAC ACTGCCTTCA
#241

>wildtype CCATACCTAG TATAAACAAT GAGACACCAG GGATTAGATA
>mutated CGATACCTAG CATCAACAAC GAGACACCAG GCATCCGCTA
#281

>wildtype TCAGTACAAT GTGCTTCCAC AGGGATGGAA AGGATCACCA
>mutated CCAGTACAAC GTGCTGCCAC AGGGATGGAA GGGATCACCA
#321

>wildtype GCAATATTCC AAAGTAGCAT GACAAAAATC TTAGAGCCTT
>mutated GCCATCTTTC AAAGCAGCAT GACCAAGATC CTGGACCCCT
#361

>wildtype TTAGAAAACA AAATCCAGAC ATAGTTATCT ATCAATACAT
>mutated TCCGCAAGCA AAACCCAGAC ATCGTGATCT ATCAGTACAT
#401
>wildtype GGATGATTTG TATGTAGGAT CTGACTTAGA AATAGGGCAG
>mutated GGACGACCTC TACGTAGGAA GTGACCTGGA GATCGGGCAG
#441

>wildtype CATAGAACAA AAATAGAGGA GCTGAGACAA CATCTGTTGA
>mutated CACAGGACCA AGATCGAGGA GCTGAGACAG CATCTGTTGA
#481

>wildtype GGTGGGGACT TACCACACCA GACAAAAAAC ATCAGAAAGA
>mutated GGTGGGGACT GACCACACCA GACAAGAAGC ACCAGAAGGA
#521

>wildtype ACCTCCATTC CTTTGGATGG GTTATGAACT CCATCCTGAT
>mutated ACCTCCCTTC CTGTGGATGG GCTACGAACT GCATCCTGAC
#561

>wildtype AAATGGACAG TACAGCCTAT AGTGCTGCCA GAAAAAGACA
>mutated AAGTGGACAG TGCAGCCCAT CGTGCTGCCT GAGAAGGACA
#601

>wildtype GCTGGACTGT CAATGACATA CAGAAGTTAG TGGGGAAATT
>mutated GCTGGACTGT GAACGACATA CAGAAGCTCG TGGGCAAGTT
#641

>wildtype GAATTGGGCA AGTCAGATTT ACCCAGGGAT TAAAGTAAGG
>mutated GAACTGGGCA AGCCAGATCT ACCCAGGCAT CAAAGTTAGG
#681

>wildtype CAATTATGTA AACTCCTTAG AGGAACCAAA GCACTAACAG
>mutated CAGCTGTGCA AGCTGCTTCG AGGAACCAAG GCACTGACAG
#721

>wildtype AAGTAATACC ACTAACAGAA GAAGCAGAGC TAGAACTGGC
>mutated AAGTGATCCC ACTGACAGAG GAAGCAGAGC TAGAACTGGC
#761

>wildtype AGAAAACAGA GAGATTCTAA AAGAACCAGT ACATGGAGTG
>mutated AGAGAACCGA GAGATCCTGA AGGAGCCAGT ACATGGAGTG
#801

>wildtype TATTATGACC CATCAAAAGA CTTAATAGCA GAAATACAGA
>mutated TACTACGACC CAAGCAAGGA CCTGATCGCA GAGATCCAGA
#841
>wildtype AGCAGGGGCA AGGCCAATGG ACATATCAAA TTTATCAAGA
>mutated AGCAGGGGCA AGGCCAATGG ACCTACCAAA TCTACCAGGA
#881

>wildtype GCCATTTAAA AATCTGAAAA CAGGAAAATA TGCAAGAATG
>mutated GCCCTTCAAG AACCTGAAGA CAGGCAAGTA CGCAAGGATG
#921

>wildtype AGGGGTGCCC ACACTAATGA TGTAAAACAA TTAACAGAGG
>mutated AGGGGTGCCC ACACCAACGA TGTGAAGCAG CTGACAGAGG
#961

>wildtype CAGTGCAAAA AATAACCACA GAAAGCATAG TAATATGGGG
>mutated CAGTGCAGAA GATCACCACA GAGAGCATCG TGATCTGGGG
#1001

>wildtype AAAGACTCCT AAATTTAAAC TGCCCATACA AAAGGAAACA
>mutated CAAGACTCCC AAGTTCAAGC TGCCCATACA GAAGGAGACA
#1041

>wildtype TGGGAAACAT GGTGGACAGA GTATTGGCAA GCCACCTGGA
>mutated TGGGAGACAT GGTGGACCGA GTACTGGCAA GCCACCTGGA
#1081



>wildtype TTCCTGAGTG GGAGTTTGTT AATACCCCTC CTTTAGTGAA
>mutated TCCCTGAGTG GGAGTTCGTG AACACCCCTC CCTTGGTGAA
#1121

>wildtype ATTATGGTAC CAGTTAGAGA AAGAACCCAT AGTAGGAGCA
>mutated ACTGTGGTAT CAGCTGGAGA AGGAACCCAT CGTGGGAGCA
#1161

>wildtype GAAACCTTCT ATGTAGATGG GGCAGCTAAC AGGGAGACTA
>mutated GAGACCTTCT ACGTGGATGG GGCAGCCAAC AGGGAGACCA
#1201

>wildtype AATTAGGAAA AGCAGGATAT GTTACTAATA GAGGAAGACA
>mutated AGCTGGGCAA GGCAGGCTAC GTGACCAACC GAGGACGACA
#1241

>wildtype AAAAGTTGTC ACCCTAACTG ACACAACAAA TCAGAAGACT
>mutated GAAAGTGGTG ACCCTGACTG ACACCACCAA CCAGAAGACT
#1281
>wildtype GAGTTACAAG CAATTTATCT AGCTTTGCAG GATTCGGGAT
>mutated GAGCTGCAAG CCATCTACCT AGCTCTGCAA GACAGCGGAC
#1321

>wildtype TAGAAGTAAA CATAGTAACA GACTCACAAT ATGCATTAGG
>mutated TGGAAGTGAA CATCGTGACA GACTCACAGT ACGCACTGGG
#1361

>wildtype AATCATTCAA GCACAACCAG ATCAAAGTGA ATCAGAGTTA
>mutated CATCATCCAA GCACAACCAG ACCAATCCGA GTCAGAGCTG
#1401

>wildtype GTCAATCAAA TAATAGAGCA GTTAATAAAA AAGGAAAAGG
>mutated GTGAACCAGA TCATCGAGCA GCTGATCAAG AAGGAGAAAG
#1441

>wildtype TCTATCTGGC ATGGGTACCA GCACACAAAG GAATTGGAGG
>mutated TGTACCTGGC ATGGGTACCA GCACACAAAG GAATTGGAGG
#1481

>wildtype AAATGAACAA GTAGATAAAT TAGTCAGTGC TGGAATCAGG
>mutated AAATGAACAA GTAGATAAAT TAGTCAGTGC TGGGATCCGG
#1521

>wildtype AAAGTACTAT TTTTAGATGG AATAGATAAG GCCCAAGATG
>mutated AAGGTGCTGT TCCTGGACGG GATCGATAAG GCCCAAGATG
#1561

>wildtype AACATGAGAA ATATCACAGT AATTGGAGAG CAATGGCTAG
>mutated AACATGAGAA GTACCACTCC AACTGGCGCG CTATGGCCAG
#1601

>wildtype TGATTTTAAC CTGCCACCTG TAGTAGCAAA AGAAATAGTA
>mutated CGACTTCAAC CTGCCACCTG TAGTAGCAAA AGAAATAGTA
#1641

>wildtype GCCAGCTGTG ATAAATGTCA GCTAAAAGGA GAAGCCATGC
>mutated GCCAGCTGTG ATAAATGTCA GCTAAAAGGA GAAGCCATGC
#1681

>wildtype ATGGACAAGT AGACTGTAGT CCAGGAATAT GGCAACTAGA
>mutated ATGGACAAGT AGACTGTAGT CCAGGAATAT GGCAGCTGGA
#1721
>wildtype TTGTACACAT TTAGAAGGAA AAGTTATCCT GGTAGCAGTT
>mutated CTGCACGCAC CTGGAGGGGA AGGTGATCCT GGTAGCAGTT
#1761

>wildtype CATGTAGCCA GTGGATATAT AGAAGCAGAA GTTATTCCAG
>mutated CATGTAGCCA GTGGATATAT AGAAGCAGAA GTTATCCCTG
#1801

>wildtype CAGAAACAGG GCAGGAAACA GCATATTTTC TTTTAAAATT
>mutated CTGAAACTGG GCAGGAAACA GCATATTTTC TTTTAAAATT
#1841

>wildtype AGCAGGAAGA TGGCCAGTAA AAACAATACA TACAGACAAT
>mutated AGCAGGAAGA TGGCCAGTAA AAACAATACA CACGGACAAC
#1881

>wildtype GGCAGCAATT TCACCAGTGC TACGGTTAAG GCCGCCTGTT
>mutated GGAAGCAACT TCACTGGTGC TACGGTTAAG GCCGCCTGTT
#1921

>wildtype GGTGGGCGGG AATCAAGCAG GAATTTGGAA TTCCCTACAA
>mutated GGTGGGCGGG AATCAAGCAG GAATTTGGAA TTCCCTACAA
#1961

>wildtype TCCCCAAAGT CAAGGAGTAG TAGAATCTAT GAATAAAGAA
>mutated TCCCCAATCG GAAGGAGTCG TCGAGAGCAT GAACAAGGAG
#2001

>wildtype TTAAAGAAAA TTATAGGACA GGTAAGAGAT CAGGCTGAAC
>mutated CTGAAGAAGA TCATCGGACA AGTGAGGGAT CAGGCTGAGC
#2041

>wildtype ATCTTAAGAC AGCAGTACAA ATGGCAGTAT TCATCCACAA
>mutated ACCTGAAGAC AGCAGTGCAG ATGGCAGTGT TCATCCACAA
#2081

>wildtype TTTTAAAAGA AAAGGGGGGA TTGGGGGGTA CAGTGCAGGG
>mutated CTTCAAAAGA AAAGGGGGGA TTGGGGGGTA CAGTGCAGGG
#2121

>wildtype GAAAGAATAG TAGACATAAT AGCAACAGAC ATACAAACTA
>mutated GAAAGGATCG TGGACATCAT CGCCACCGAC ATCCAAACCA
#2161
>wildtype AAGAATTACA AAAACAAATT ACAAAAATTC AAAATTTTCG
>mutated AGGAGCTGCA GAAGCAGATC ACCAAGATCC AGAACTTCCG
#2201

>wildtype GGTTTATTAC AGGGACAGCA GAAATCCACT TTGGAAAGGA
>mutated GGTGTACTAC CGCGACAGCC GCAACCCACT GTGGAAGGGA
#2241

>wildtype CCAGCAAAGC TCCTCTGGAA AGGTGAAGGG GCAGTAGTAA
>mutated CCAGCAAAGC TCCTCTGGAA GGGAGAGGGG GCAGTGGTGA
#2281

>wildtype TACAAGATAA TAGTGACATA AAAGTAGTGC CAAGAAGAAA
>mutated TCCAGGACAA CAGTGACATC AAAGTGGTGC CAAGGCGCAA
#2321

>wildtype AGCAAAGATC ATTAGGGATT ATGGAAAACA GATGGCAGGT
>mutated GGCCAAGATC ATCCGCGACT ATGGAAAACA GATGGCAGGT
#2361

>wildtype GATGATTGTG TGGCAAGTAG ACAGGATGAG GATTAGAACA
>mutated GATGATTGTG TGGCAAGTAG ACAGGATGAG GATTAGAACC
#2401

>wildtype TGGAAAAGTT TAGTAAAACA CCATATG
>mutated TGGAAGAGCC TGGTGAAGCA CCATATG
#2441



ATGGGCGTGAGAAACTCCGTCTTGTCAGGGAAGAAAGCAGATGAATTAG 

AAAAAATTAGGCTACGACCCAACGGAAAGAAAAAGTACATGTTGAAGC 

ATGTAGTATGGGCAGCAAATGAATTAGATAGATTTGGATTAGCAGAAAG 

CCTGTTGGAGAACAAAGAAGGATGTCAAAAAATACTTTCGGTCTTAGCT 

CCATTAGTGCCAACAGGCTCAGAAAATTTAAAAAGCCTTTATAATACTG 

TCTGCGTCATCTGGTGCATTCACGCAGAAGAGAAAGTGAAACACACTGA 

GGAAGCAAAACAGATAGTGCAGAGACACCTAGTGGTGGAAACAGGAAC 

CACCGAAACCATGCCGAAGACCTCTCGACCAACAGCACCATCTAGCGGC 

AGAGGAGGAAACTACCCAGTACAGCAGATCGGTGGCAACTACGTCCAC 

CTGCCACTGTCCCCGAGAACCCTGAACGCTTGGGTCAAGCTGATCGAGG 

AGAAGAAGTTCGGAGCAGAAGTAGTGCCAGGATTCCAGGCACTGTCAG 

AAGGTTGCACCCCCTACGACATCAACCAGATGCTGAACTGCGTTGGAGA 

CCATCAGGCGGCTATGCAGATCATCCGTGACATCATCAACGAGGAGGCT 

GCAGATTGGGACTTGCAGCACCCACAACCAGCTCCACAACAAGGACAA 

CTTAGGGAGCCGTCAGGATCAGACATCGCAGGAACCACCTCCTCAGTTG 

ACGAACAGATCCAGTGGATGTACCGTCAGCAGAACCCGATCCCAGTAGG 

CAACATCTACCGTCGATGGATCCAGCTGGGTCTGCAGAAATGCGTCCGT 

ATGTACAACCCGACCAACATTCTAGATGTAAAACAAGGGCCAAAAGAG 

CCATTTCAGAGCTATGTAGACAGGTTCTACAAAAGTTTAAGAGCAGAAC

AGACAGATGCAGCAGTAAAGAATTGGATGACTCAAACACTGCTGATTCA 

AAATGCTAACCCAGATTGCAAGCTAGTGCTGAAGGGGCTGGGTGTGAAT 

CCCACCCTAGAAGAAATGCTGACGGCTTGTCAAGGAGTAGGGGGGCCG 

GGACAGAAGGCTAGATTAATGGCAGAAGCCCTGAAAGAGGCCCTCGCA 

CCAGTGCCAATCCCTTTTGCAGCAGCCCAACAGAGGGGACCAAGAAAGC 

CAATTAAGTGTTGGAATTGTGGGAAAGAGGGACACTCTGCAAGGCAATG 

CAGAGCCCCAAGAAGACAGGGATGCTGGAAATGTGGAAAAATGGACCA 

TGTTATGGCCAAATGCCCAGACAGACAGGCGGGTTTTTTAGGCCTTGGT

CCATGGGGAAAGAAGCCCCGCAATTTCCCCATGGCTCAAGTGCATCAGG 

GGCTGATGCCAACTGCTCCCCCAGAGGACCCAGCTGTGGATCTGCTAAA 

GAACTACATGCAGTTGGGCAAGCAGCAGAGAGAAAAGCAGAGAGAAAG 

CAGAGAGAAGCCTTACAAGGAGGTGACAGAGGATTTGCTGCACCTCAAT 

TCTCTCTTTGGAGGAGACCAGTAG 



FIG. 3 
SIV gag

#1 

ATGGGCGTGAGAAACTCCGTCTTGTCAGGGAAGAAAGCAG 

SIV gag 

#41 

ATGAATTAGAAAAAATTAGGCTACGACCCAACGGAAAGAA

SIV gag 

#81 

AAAGTACATGTTGAAGCATGTAGTATGGGCAGCAAATGAA 

SIV gag 

#121 

TTAGATAGATTTGGATTAGCAGAAAGCCTGTTGGAGAACA 

SIV gag 

#161 

AAGAAGGATGTCAAAAAATACTTTCGGTCTTAGCTCCATT 

SIV gag 

#201 

AGTGCCAACAGGCTCAGAAAATTTAAAAAGCCTTTATAAT 

SIV gag 

#241 

ACTGTCTGCGTCATCTGGTGCATTCACGCAGAAGAGAAAG 

SIV gag 

SIVgagDX . . 

#281 

TGAAACACACTGAGGAAGCAAAACAGATAGTGCAGAGACA 

SIV gag A— A T A—A 

SIVgagDX. . C — C C G — G 

#321 

CCTAGTGGTGGAAACAGGAACMACMGAAACYATGCCRAAR 

SIV gag — AAG-A 

SIVgagDX. .— CTC-C 

#361 

ACMWSTMGACCAACAGCACCATCTAGCGGCAGAGGAGGAA 
SIV gag -T A — A— A T

SIVgagDX. . -C G — G — C C 

#401 

AYTACCCAGTACARCARATMGGTGGTAACTAYGTCCACCT 

SIV gag T-AAG AT-A — T — C A — AT — 

SIVgagDX. . C-GTC CC-G--C--T C--GC—

#441 

GCCAYTRWSCCCGAGAACMYTRAAYGCYTGGGTMAARYTG 

SIV gag --A A A— T 

SIVgagDX. . — C G G~C 

#481 

ATMGAGGARAAGAARTTYGGAGCAGAAGTAGTGCCAGGAT 

SIV gag -T T T — 

SIVgagDX. . -C -C C — 

#521 

TYCAGGCACTGTCAGAAGGTTGCACCCCCTAYGACATYAA 

SIV gag T T-A — T--T — G A 

SIVgagDX. .C C-G — C — C — T G 

#561 

YCAGATGYTRAAYTGYGTKGGAGACCATCARGCGGCTATG 

SIV gag T A- A — T — T — A 

SIVgagDX. . C C-T— C— C— C 

#601 

CAGATYATCMGWGAYATYATMAACGAGGAGGCTGCAGATT 

SIV gag 

SIVgagDX. 

#641 

GGGACTTGCAGCACCCACAACCAGCTCCACAACAAGGACA 

SIV gag T — T A — T 

SIVgagDX. . C— C C — C 

#681 

ACTTAGGGAGCCGTCAGGATCAGAYATYGCAGGAACMACY 

SIV gag AGT A— T A A-A — A- 

SIVgagDX. .TCC T — C G C-T — G-

#721 

WSYTCAGTWGAYGAACARATCCAGTGGATGTACMGWCARC



SIV gag C — A T A-GA 

SIVgagDX. . G — C C C-TC 

#761

AGAACCCSATMCCAGTAGGCAACATYTACMGKMGATGGAT 

SIV gag A GT A — A — T — CA-A T A 

SIVgagDX. . G TC G — G — C — TC-T C G 

#801 

CCARCTGGGKYTGCARAARTGYGTYMGWATGTAYAACCCR 

SIV gag —A 

SIVgagDX. . — C 

#841 

ACMAACATTCTAGATGTAAAACAAGGGCCAAAAGAGCCAT 

SIV gag 

#881

TTCAGAGCTATGTAGACAGGTTCTACAAAAGTTTAAGAGC 

SIV gag 

#921 

AGAACAGACAGATGCAGCAGTAAAGAATTGGATGACTCAA 

SIV gag 

#961 

ACACTGCTGATTCAAAATGCTAACCCAGATTGCAAGCTAG 

SIV gag 

#1001 

TGCTGAAGGGGCTGGGTGTGAATCCCACCCTAGAAGAAAT 

SIV gag 

#1041 

GCTGACGGCTTGTCAAGGAGTAGGGGGGCCGGGACAGAAG 

SIV gag 

#1081 

GCTAGATTAATGGCAGAAGCCCTGAAAGAGGCCCTCGCAC 

SIV gag 

#1121 

CAGTGCCAATCCCTTTTGCAGCAGCCCAACAGAGGGGACC 

SIV gag 

#1161 

AAGAAAGCCAATTAAGTGTTGGAATTGTGGGAAAGAGGGA
SIV gag

#1201 

CACTCTGCAAGGCAATGCAGAGCCCCAAGAAGACAGGGAT 

SIV gag 

#1241 

GCTGGAAATGTGGAAAAATGGACCATGTTATGGCCAAATG 

SIV gag 

#1281 

CCCAGACAGACAGGCGGGTTTTTTAGGCCTTGGTCCATGG 

SIV gag 

#1321 

GGAAAGAAGCCCCGCAATTTCCCCATGGCTCAAGTGCATC 

SIV gag 

#1361 

AGGGGCTGATGCCAACTGCTCCCCCAGAGGACCCAGCTGT 

SIV gag 

#1401 

GGATCTGCTAAAGAACTACATGCAGTTGGGCAAGCAGCAG 

SIV gag

#1441 

AGAGAAAAGCAGAGAGAAAGCAGAGAGAAGCCTTACAAGG 

SIV gag 

#1481 

AGGTGACAGAGGATTTGCTGCACCTCAATTCTCTCTTTGG 

SIV gag 

#1521 

AGGAGACCAGTAG 
BssHII (711)
ClaI (830)
AccI (959)

ClaI (3259)

XhoI (3548)
ApaI (3557)
KpnI (3563)
BssHII (711)
ClaI (830)
AccI (959)

ClaI (3259)

XhoI (3548)
ApaI (3557)
KpnI (3563)
1 CCTGGCCATT GCATACGTTG TATCCATATC ATAATATGTA CATTTATATT GGCTCATGTC CAACATTACC
71 CCCATCTTGA CATTGATTAT TGACTAGTTA TTAATACTAA TCAATTACGG GGTCATTAGT TCATAGCCCA
141 TATATCGAGT TCCGCGTTAC ATAACTTACG CTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCCCC
211 CATTGACGTC AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC GTCAATGGGT 
281 GGAGTATTTA CGCTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA TGCCAACTAC CCCCCCTATT 
351 GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC AGTACATGAC CTTATCGCAC TTTCCTACTT 
421 GGCAGTACAT CTACGTATTA GTCATCGCTA TTACCATGGT GATGCGGTTT TGGCAGTACA TCAATGGCCG 
491 TGGATAGCGG TTTGACTCAC GCGCACTTCC AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG 
561 CACCAAAATC AACGGGACTT TCCAAAATGT CGTAACAACT CCGCCCCATT GACGCAAATG CGCGGTAGCC 
631 GTGTACGGTG CGAGCTCTAT ATAAGCAGAG CTCGTTTAGT GAACCGTCAG ATCGCCTGGA GACCCCATCC 

SalI (758)

701 ACGCTCTTTT GACCTCCATA GAACACACCG CGACCGATCC AGCCTCCGCG CGCGCGCGTC GACAGAGAGA
771 TGGGTGCGAG AGCGTCAGTA TTAAGCGGGG CAGAATTAGA TCGATGGGAA AAAATTCGGT TAAGGCCAGG
841 GGGAAAGAAG AACTACAACC TAAAGCACAT CGTATGGGCA AGCAGGGAGC TAGAACGATT CGCAGTTAAT
911 CCTGGCCTGT TAGAAACATC AGAAGGCTGT AGACAAATAC TGGGACAGCT ACAACCATCC CTTCAGACAG 
981 GATCAGAGGA GCTTCGATCA CTATACAACA CAGTAGCAAC CCTCTATTGT GTGCACCAGC GGATCGAGAT 
1051 CAAGGACACC AAGGAAGCTT TAGACAAGAT AGAGGAAGAG CAAAACAAGT CCAAGAAGAA GGCCCAGCAG 
1121 GCAGCAGCTG ACACAGGACA CAGCAATCAG CTCACCCAAA ATTACCCTAT ACTGCAGAAC ATCCAGGGGC 
1191 AAATGGTACA TCAGGCCATA TCACCTAGAA CTTTAAATGC ATGGGTAAAA CTACTACAAG AGAAGGCTTT
1261 CAGCCCAGAA CTGATACCCA TGTTTTCAGC ATTATCAGAA GGAGCCACCC CACAGGACCT GAACACGATG 
1331 TTCAACACCG TGGGGGGACA TCAAGCAGCC ATGCAAATCT TAAAAGAGAC CATCAATGAG GAAGCTGCAG 
1401 AATGGGATAG ACTCCATCCA GTGCATGCAG GGCCTATTGC ACCAGGCCAG ATGAGAGAAC CAAGGGGAAG 
1471 TGACATAGCA CGAACTACTA GTACCCTTCA GGAACAAATA GGATGGATGA CAAATAATCC ACCTATCCCA 
1541 GTACCAGAGA TCTACAAGAG GTGGATAATC CTGGGATTGA ACAACATCCT GAGGATCTAT AGCCCTACCA
1611 GCATTCTGGA CATAACACAA GGACCAAAGG AACCCTTTAG AGACTATGTA GACCGGTTCT ATAAAACTCT
1681 AAGAGCTGAG CAAGCTTCAC AGGAGGTAAA AAATTGGATG ACAGAAACCT TGTTGGTCCA AAATGCCAAC 
1751 CCAGATTGTA AGACCATCCT GAAGGCTCTC GGCCCAGCGG CTACACTAGA AGAAATGATG ACACCATCTC 
1821 AGGGAGTAGG AGGACCCGGC CATAAGGCAA GACTTTTGGC CGAGGCGATG AGCCAGGTGA CGAACTCGGC 
1891 GACCATAATG ATGCAGACAG GCAACTTCCG GAACCAGCGG AAGATCGTCA AGTGCTTCAA TTGTCGCAAA
1961 CAACCGCACA CCGCCAGGAA CTCCCGGGCC CCCCGGAAGA ACCCCTGTTG GAAATGTGGA AAGGAAGGAC 
2031 ACCAAATGAA AGATTGTACT GAGAGACACG CTAATTTTTT AGGGAAGATC TGGCCTTCCT ACAAGGGAAG 
2101 GCCAGGGAAT TTTCTTCAGA GCACACCACA GCCAACAGCC CCACCAGAAG AGAGCTTCAG GTCTGGGGTA 
2171 GAGACAACAA CTCCCCCTCA GAAGCAGGAG CCCATAGACA AGGAACTGTA TCCTTTAACT TCCCTCAGAT 
2241 CACTCTTTGG CAACGACCCC TCGTCACACT AACGATCGGG GCGCAACTCA AGGAAGCGCT CCTCCATACA 
2311 GGACCAGATG ATACAGTATT AGAAGAAATG AGTTTGCCAG GAAGATGGAA ACGAAAAATG ATAGGCCGGA 
2381 TCCGGGGCTT CATCAAGGTG AGGCAGTACG ACCAGATACT CATAGAAATC TGTGGACATA AAGCTATAGG 
2451 TACAGTATTA CTACCACCTA CACCTGTCAA CATAATTGCA AGAAATCTGT TGACCCAGAT CGGCTCCACC 
2521 TTGAACTTCC CCATCAGCCC TATTGAGACG GTGCCCGTGA ACTTGAAGCC GGGGATGGAC CCCCCCAAGG 
2591 TCAAGCAATG CCCATTGACG AAAGAGAAGA TCAAGGCCTT AGTCGAAATC TGTACAGAGA TGGAGAAGGA 
2661 AGGGAAGATC AGCAAGATCG GGCCTGAGAA CCCCTACAAC ACTCCAGTCT TCGCAATCAA GAAGAACGAC 
2731 AGTACCAAGT GGAGAAAGCT GGTGGACTTC AGAGAGCTGA ACAAGAGAAC TCAGGACTTC TGGGAAGTTC 
2801 AGCTGGGCAT CCCACATCCC GCTGGGTTGA ACAAGAAGAA GTCAGTGACA GTGCTGGATG TGGGTGATGC 
2871 CTACTTCTCC GTTCCCTTGG ACGAGGACTT CAGGAAGTAC ACTGCCTTCA CGATACCTAG CATCAACAAC 
2941 GACACACCAC GCATCCGCTA CCAGTACAAC GTGCTGCCAC AGGGATGGAA GGGATCACCA GCCATCTTTC 
3011 AAACCACCAT GACCAAGATC CTGGAGCCCT TCCGCAAGCA AAACCCAGAC ATCGTGATCT ATCAGTACAT 
3081 GGACGACCTC TACCTAGGAA GTGACCTGGA GATCGGGCAG CACAGGACCA AGATCGAGGA GCTGAGACAG 
3151 CATCTGTTGA CGTGGGGACT GACCACACCA GACAAGAAGC ACCAGAAGGA ACCTCCCTTC CTGTGGATGG 
3221 CCTACCAACT GCATCCTGAC AAGTGGACAG TGCAGCCCAT CGTGCTGCCT GAGAAGGACA GCTGCACTGT 
3291 GAACGACATA CACAAGCTCG TGGGCAAGTT GAACTGGGCA AGCCAGATCT ACCCAGGCAT CAAAGTTAGG 
3361 CAGCTGTGCA AGCTGCTTCG AGGAACCAAG CCACTGACAG AAGTGATCCC ACTGACAGAG GAAGCAGAGC 
3431 TAGAACTGGC AGAGAACCGA GAGATCCTGA ACGAGCCAGT ACATGGAGTG TACTACGACC CAAGCAAGCA 
3501 CCTGATCGCA GAGATCCAGA AGCAGGGGCA AGGCCAATGG ACCTACCAAA TCTACCAGGA GCCCTTCAAG 
3571 AACCTGAAGA CAGGCAAGTA CGCAAGGATG AGGGCTGCCC ACACCAACGA TGTGAAGCAG CTGACAGAGG 
3641 CAGTGCAGAA GATCACCACA GAGAGCATCG TGATCTCGCG CAAGACTCCC AACTTCAAGC TGCCCATACA
3711 GAAGGAGACA TGGGAGACAT GGTGGACCGA GTACTGGCAA GCCACCTGGA TCCCTGAGTG CGAGTTCGTG 
3781 AACACCCCTC CCTTGCTCAA ACTCTCCTAT CAGCTCGAGA AGCAACCCAT CGTGCGAGCA GAGACCTTCT

3851 ACGTCCATCC CGCACCCAAC AGGGAGACCA AGCTCGCCAA CCCACGCTAC GTGACCAACC GAGGACGACA

3921 CAAAGTGGTC ACCCTCACTG ACACCACCAA CCAGAAGACT GAGCTGCAAG CCATCTACCT AGCTCTGCAA

3991 GACACCGGAC TCGAACTGAA CATCGTGACA CACTCACACT ACGCACTCGG CATCATCCAA GCACAACCAG

4061 ACCAATCCCA GTCACAGCTC GTCAACCAGA TCATCGAGCA CCTCATCAAC AAGGAGAAAC TGTACCTCGC

4131 ATCGGTACCA GCACACAAAG CAATTCGACG AAATCAACAA GTAGATAAAT TAGTCAGTCC TGGGATCCGC

4201 AACGTGCTGT TCCTCCACCC CATCCATAAG GCCCAAGATG AACATGAGAA CTACCACTCC AACTGGCGCG

4271 CTATGCCCAG CGACTTCAAC CTGCCACCTC TAGTAGCAAA AGAAATAGTA GCCAGCTGTG ATAAATCTCA

4341 CCTAAAAGGA GAAGCCATGC ATGCACAAGT AGACTGTAGT CCACGAATAT GGCAGCTGGA CTGCACGCAC

4411 CTCCACCGCA AGGTCATCCT GGTAGCAGTT CATGTAGCCA CTCGATATAT AGAAGCAGAA GTTATCCCTG

4481 CTGAAACTCG CCAGCAAACA GCATATTTTC TTTTAAAATT AGCAGGAACA TGGCCAGTAA AAACAATACA

4551 CACCCACAAC GGAAGCAACT TCACTGGTCC TACGGTTAAG GCCCCCTCTT GGTGGGCGGG AATCAAGCAG

4621 GAATTTGGAA TTCCCTACAA TCCCCAATCG CAAGGAGTCG TCGAGAGCAT GAACAAGGAG CTGAACAACA

4691 TCATCGGACA AGTGAGGGAT CAGGCTGAGC ACCTGAACAC AGCAGTGCAG ATGGCAGTGT TCATCCACAA

4761 CTTCAAAACA AAAGGGGGGA TTGGGGGGTA CAGTCCAGGG GAAAGGATCG TCGACATCAT CGCCACCGAC

4831 ATCCAAACCA AGGAGCTGCA GAAGCAGATC ACCAACATCC AGAACTTCCG GGTGTACTAC CGCGACACCC

4901 CCAACCCACT GTGGAACCGA CCAGCAAAGC TCCTCTGGAA GCCAGAGGCG GCACTCCTCA TCCAGGACAA

4971 CACTGACATC AAACTCCTCC CAAGGCGCAA GGCCAAGATC ATCCGCGACT ATGGAAAACA CATGGCACCT

5041 GATGATTGTG TCGCAAGTAC ACAGGATGAG GATTAGAACC TGGAAGAGCC TGGTGAAGCA CCATATGGCG

NheI (5117)
BstBI (5111)

5111 TTCGAAGCTA GCCTCGAGAT CCAGATCTGC TGTGCCTTCT ACTTCCCAGC CATCTGTTCT TTCCCCCTCC

5181 CCCGTGCGTT CCTTCACCCT GGAAGCTGCC ACTCCCACTG TCCTTTCCTA ATAAAATGAG CAAATTGCAT

5251 CGCATTGTCT CAGTACGTGT CATTCTATTC TGCGGGGTGG GCTGGGGCAG CACAGCAAGG GGGAGGATTG

5321 CCAACACAAT AGCAGCCATC CTCGGCATCC GGTGGGCTCT ATGCGTACCC AGGTGCTGAA GAATTGACCC

5391 CCTTCCTCCT CGGCCAGAAA GAAGCAGGCA CATCCCCTTC TCTCTGACAC ACCCTGTCCA CGCCCCTGGT

5461 TCTTAGTTCC AGCCCCACTC ATAGGACACT CATAGCTCAG CAGGGCTCCG CCTTCAATCC CACCCGCTAA 

5531 AGTACTTGGA CCCCTCTCTC CCTCCCTCAT CAGCCCACCA AACCAAACCT AGCCTCCAAG AGTGGGAACA 
5601 MTTAAACCA ACATACGCTA TTAAGTGCAC ACGCACACAA AATCCCTCCA ACATGTCAGG AAGTAATCAG
ACAAATCATA^CAATTTCTTC CCCTTCCTCC CTCACTGACT CCCTCCGCTC GCTCGTTCGG CTGCGGCGAG 

S52SSS5 SffiSJS ESSSE SSSSSS SEttSSSS caaagaacat 

CGCCCCCCTG ACGACCATCA CAAAAATCCA CCCTCAAf^rr ?£a^S2SS£ TGGCGTTTTT CCATAGGCTC 
GATACCAGCC GTTTCCCCCT CCAACCTCCC TcSr^Sr iSSSSSSS, ^CCCCACA CGACTATAAA 
CCTCTCCCCC TTTCTCCCTT CGGGAAGCGT ^TCTTCCC ACCCTGCCGC TTACCGGATA 

GTCTAGGTCG TTCCCTCCAA GCTCCCCTCT CTC^SS r^rr?^ SS 6 ™* 1 * TCTCACTTCC 
CCGGTAACTA TCCTCTTCAG TCCAACCCCC TAAr^^i £™5£H£ A GCCCGACCCC TGCCCCTTAT 


AGGGACCCAC GGTTGATCAG ACCTTTGTTG TAGGTGGACC Ig^^aS $ A j£ ATCCAG CCAGAAACTG 


j&a&SS ifeasjs tSSSSS TSttlfilflS Sil 4 ""* c ™™™ cictttatct 

5SS SnWTOC SSSSa SSSKS ESSSE? Ji^TCAOA CATTTTCACA 
ATATTTCAAT CTATTTACAA AAATAAACAA IScccmf JfJSSSfS JTTCTCTCAT OAGCCOATAC 
ACCTCTAAGA AACCATTATT ATCATCACAT tUSSTTS TCCCCCAAAA GTOCCACCTC 

SS SS £™ S AGCTTGTCTC 

sssss s ™l S 1 IIS Jet 8 38588 
FIG. 9D



1 TGCAAGGCCT AATTTGGTCC CAAAAAAGAC AACACATCCT TCATCTGTGG ATCTACCACA CACAAGCCTA

71 CTTCCCTGAT TGGCAGAACT ACACACCAGG CCCACGGATC ACATATCCAC TGACCTTTGC ATGCTGCTTC

141 AACTTACTAC CAGTTCAACC ACAGCAAGTA CAACAGGCCA AATAAGCAGA GAAGAACAGC TTGTTACACC

211 CTATGAGCCA CCATGGCATG GAGGACCCCG AGGGAGAAGT ATTACTGTGG AAGTTTCACA GCCTCCTAGC

281 ATTTCGTCAC ATCCCCCCAC AGCTGCATCC GGAGTACTAC AAACACTGCT CACATCGAGC TTTCTACAAG

351 GGACTTTCCG CTGGGGACTT TCCACGGAGG TGTGGCCTGG GCCGGACTGG CGAGTCGCGA GCCCTCAGAT

421 GCTACATATA ACCACCTGCT TTTTGCCTGT ACTGGGTCTC TCTGGTTAGA CCAGATCTGA GCCTGGGAGC

491 TCTCTGGCTA ACTAGGGAAC CCACTGCTTA AGCCTCAATA AAGCTTGCCT TGACTGCTCA AAGTAGTGTG

561 TGCCCGTCTG TTGTGTGACT CTCGTAACTA GAGATCCCTC AGACCCTTTT AGTCAGTGTG GAAAATCTCT

631 AGCAGTGGCG CCCGAACAGG GACTTGAAAG CGAAAGTAAA GCCAGAGGAG ATCTCTCGAC GCAGGACTCG

BssHII (711)

701 GCTTGCTCAA GCGCGCacgg caagaggcga ggggcggcgC ctgACgagGa cgccaaaaac tttgaccagc

ClaI (830)

771 ggaggctaga aggagagagC TCGGTGCCAG AGCGTCAGTA TGAAGCGGGG GAGAATTAGA TCGATGGGAA

841 AAAATTCGCT TAAGGCCAGG GGGAAAGAAA AAATATAAAT TAAAACATAT AGTATGGGCA AGCAGGGAGC

AccI (959)

911 TAGAACGATT CGCAGTTAAT CCTGGCCTGT TAGAAACATC AGAAGGCTGT AGACAAATAC TGGGACAGCT

981 ACAACCATCC CTTCAGACAG GATCAGAAGA ACTTAGATCA TTATATAATA CAGTAGCAAC CCTCTATTGT

1051 GTGCATCAAA GGATAGAGAT AAAAGACACC AAGGAAGCTT TAGACAAGAT AGAGGAAGAG CAAAACAAAA
1121 GTAAGAAAAA AGCACAGCAA GCAGCAGCTG ACACAGGACA CAGCAATCAG GTCAGCCAAA ATTACCCTAT

1191 AGTCCACAAC ATCCAGGGGC AAATCGTACA TCACCCCATA TCACCTAGAA CTTTAAACGA TAAGCTTGGG

1261 AGTTCCGCCT TACATAACTT ACGGTAAATG GCCCGCCTCG CTGACCGCCC AACGACCCCC GCCCATTGAC

1331 GTCAATAATG ACGTATGTTC CCATAGTAAC GCCAATAGGG ACTTTCCATT GACCTCAATG GGTGGAGTAT

1401 TTACGGTAAA CTGCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACCTCA

1471 ATGACGGTAA ATGCCCCGCC TCCCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA CTTGGCAGTA

1541 CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGG TTTTGGCAGT ACATCAATGG GCGTGGATAG

1611 CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG ACGTCAATGG GAGTTTGTTT TGGCACCAAA

1681 ATCAACGGGA CTTTCCAAAA TGTCGTAACA ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG

1751 GTGGGACGTC TATATAAGCA GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT

1821 TTTGACCTCC ATAGAAGACA CCGACTCTAG AGgatccATC TAAGTAAGCT TGGCATTCCG GTACTGTTGG



1891 TAAAATGGAA GACGCCAAAA ACATAAAGAA AGGCCCGGCG CCATTCTATC CTCTAGAGGA TGGAACCGCT

1961 GGAGAGCAAC TGCATAAGGC TATGAAGAGA TACGCCCTGG TTCCTGGAAC AATTGCTTTT ACAGATGCAC

2031 ATATCCAGGT GAACATCACG TACGCGGAAT ACTTCGAAAT GTCCGTTCGG TTCGCAGAAG CTATGAAACG

2101 ATATCGGCTG AATACAAATC ACAGAATCCT CGTATGCAGT GAAAACTCTC TTCAATTCTT TATCCCCGTG

2171 TTGGGCCCGT TATTTATCGG AGTTGCAGTT GCGCCCGCGA ACGACATTTA TAATGAACGT GAATTGCTCA

2241 ACAGTATGAA CATTTCGCAG CCTACCGTAG TGTTTGTTTC CAAAAAGGGG TTGCAAAAAA TTTTGAACGT

2311 GCAAAAAAAA TTACCAATAA TCCAGAAAAT TATTATCATG GATTCTAAAA CGGATTACCA GGGATTTCAG
2381 TCCATCTACA CGTTCCTCAC ATCTCATCTA CCTCCCGGTT TTAATGAATA CGATTTTGTA CCACACTCCT

2451 TTGATCGTGA CAAAACAATT GCACTGATAA TGAATTCCTC TGCATCTACT CCCTTACCTA AGGGTCTGGC 

2521 CCTTCCGCAT ACAACTCCCT GCCTCAGATT CTCGCATGCC AGAGATCCTA TTTTTGGCAA TCAAATCATT 

2591 CCGGATACTG CGATTTTAAC TCTTCTTCCA TTCCATCACG CTTTTGCAAT GTTTACTACA CTCGCATATT 

2661 TGATATGTGG ATTTCCAGTC GTCTTAATGT ATAGATTTGA AGAAGAGCTG TTTTTACGAT CCCTTCAGGA

2731 TTACAAAATT CAAAGTGCGT TGCTACTACC AACCCTATTT TCATTCTTCG CCAAAAGCAC TCTGATTGAC 

2801 AAATACCATT TATCTAATTT ACACGAAATT GCTTCTGGGG GCGCACCTCT TTCGAAAGAA GTCGGGGAAG 

2871 CGGTTGCAAA ACGCTTCCAT CTTCCAGGGA TACGACAAGG ATATGGGCTC ACTGAGACTA CATCAGCTAT 

2941 TCTGATTACA CCCGAGGGGG ATGATAAACC GGGCGCGGTC GGTAAAGTTG TTCCATTTTT TGAAGCGAAG 

3011 GTTGTGGATC TGGATACCGG GAAAACGCTG GGCGTTAATC AGAGAGGCGA ATTATGTGTC AGAGGACCTA 

3081 TGATTATGTC CGGTTATGTA AACAATCCGG AAGCGACCAA CGCCTTGATT GACAAGGATG GATGGCTACA 

3151 TTCTGGAGAC ATAGCTTACT GGGACGAAGA CGAACACTTC TTCATAGTTG ACCGCTTGAA GTCTTTAATT 

ClaI (3259)

3221 AAATACAAAG GATATCAGGT CGCCCCCCCT GAATTGCAAT CGATATTGTT ACAACACCCC AACATCTTCG

3291 ACGCGCGCGT GGCAGGTCTT CCCGACGATG ACGCCGGTGA ACTTCCCGCC GCCCTTCTTG TTTTGGAGCA

3361 CGGAAAGACG ATGACGGAAA AAGAGATCGT GGATTACGTC GCCACTCAAG TAACAACCGC GAAAAAGTTG 

3431 CGCCGACGAG TTGTGTTTGT GGACGAAGTA CCGAAAGGTC TTACCGGAAA ACTCGACCCA AGAAAAATCA

XhoI (3548) ApaI (3557) KpnI (3563)

3501 GACAGATCCT CATAAAGGCC AAGAAGGGCG GAAAGTCCAA ATTGTAAcTC GAGGGGGGGC CCGGTACCTT
3571 TAAGACCAAT GACTTACAAG GCAGCTGTAG ATCTTAGCCA CTTTTTAAAA CAAAAGGCGG GACTCGAAGG



3641 GCTAATTCAC TCCCAAACAA GACAAGATAT CCTTCATCTC TGGATCTACC ACACACAACG CTACTTCCCT

3711 GATTGGCAGA ACTACACACC AGGGCCAGGG GTCAGATATC CACTGACCTT TGGATGGTGC TACAAGCTAG

3781 TACCAGTTGA CCCACATAAG GTAGAAGAGG CCAATAAAGG AGAGAACACC AGCTTGTTAC ACCCTGTGAG

3851 CCTGCATGGA ATGGATGACC CTCAGAGAGA AGTGTTAGAG TGGAGGTTTG ACAGCCGCCT AGCATTTCAT

3921 CACGTGGCCC CACACCTCCA TCCGGAGTAC TTCAAGAACT GCTGACATCG AGCTTGCTAC AAGGGACTTT

3991 CCGCTGGGGA CTTTCCAGGG AGGCGTGGCC TGGGCGGGAC TGGGGAGTGG CGAGCCCTCA GATGCTGCAT

4061 ATAAGCAGCT GCTTTTTGCC TCTACTCCGT CTCTCTGGTT AGACCAGATC TGAGCCTGGG AGCTCTCTGG

4131 CTAACTAGGG AACCCACTGC TTAAGCCTCA ATAAAGCTTG CCTTGAGTGC TTCAAGTAGT GTGTCCCCGT

4201 CTGTTGTGTG ACTCTGGTAA CTAGAGATCC CTCAGACCCT TTTAGTCAGT GTGGAAAATC TCTACCACCC



4271 CCCAGGAGGT AGAGGTTGCA GTGAGCCAAG ATCGCGCCAC TGCATTCCAG CCTGGGCAAG AAAACAAGAC

4341 TGTCTAAAAT AATAATAATA AGTTAAGGGT ATTAAATATA TTTATACATG GAGGTCATAA AAATATATAT

4411 ATTTGCGCTG GGCGCAGTGG CTCACACCTG CGCCCGGCCC TTTGGGAGGC CGAGGCAGGT CGATCACCTG

4481 AGTTTGGGAG TTCCAGACCA GCCTGACCAA CATGGAGAAA CCCCTTCTCT GTGTATTTTT ATGAGATTTT

4551 ATTTTATGTG TATTTTATTC ACAGGTATTT CTGGAAAACT GAAACTGTTT TTCCTCTACT CTCATACCAC 

4621 AAGAATCATC AGCACAGAGG AAGACTTCTG TGATCAAATG TGGTGGGAGA GGGAGGTTTT CACCAGCACA 

4691 TGAGCAGTCA GTTCTGCCGC AGACTCGGCG GGTGTCCTTC GGTTCAGTTC CAACACCGCC TGCCTCGAGA 

4761 GAGGTCAGAC CACAGGGTGA GGGCTCAGTC CCCAAGACAT AAACACCCAA GACATAAACA CCCAACAGGT 

4831 CCACCCCGCC TGCTGCCCAG GCAGAGCGGA TTCACCAAGA CGGGAATTAG GATAGAGAAA CACTAACTCA 

4901 CACAGAGCCG GCTGTGCGGG AGAACGGAGT TCTATTATGA CTCAAATCAG TCTCCCCAAG CATTCGGGGA

4971 TCACAGTTTT TAAGGATAAC TTACTGTGTA GGGGGCCAGT GAGTTGGAGA TGAAAGCCTA GGGAGTCGAA

5041 GGTGTCCTTT TGCGCCGAGT CAGTTCCTGG GTGGGGGCCA CAAGATCGGA TGAGCCAGTT TATCAATCCG

5111 GGGGTGCCAG CTGATCCATG GAGTGCACGG TCTGCAAAAT ATCTCAAGCA CTGATTGATC TTAGGTTTTA

5181 CAATAGTGAT GTTACCCCAG GAACAATTTG GGGAAGGTCA GAATCTTGTA GCCTGTAGCT CCATGACTCC

5251 TAAACCATAA TTTCTTTTTT CTTTTTTTTT TTTTATTTTT GAGACAGGGT CTCACTCTGT CACCTAGGCT 

5321 GGAGTGCAGT GGTGCAATCA CAGCTCACTG CAGCCCCTAG AGCGGCCGCC ACCGCGGTGG ACCTCCAATT 

5391 CGCCCTATAG TGAGTCGTAT TACAATTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG 

5461 CGTTACCCAA CTTAATCGCC TTGCAGCACA TCCCCCTTTC GCCAGCTGGC GTAATAGCGA AGAGGCCCCC 

5531 ACCGATCGCC CTTCCCAACA GTTGCGCAGC CTGAATGGCG AATGGCGCGA AATTGTAAAC GTTAATATTT 

5601 TGTTAAAATT CGCGTTAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA TAGGCCGAAA TCGGCAAAAT 
5671 CCCTTATAAA TCAAAAGAAT AGACCGAGAT AGGGTTGAGT GTTGTTCCAG TTTGGAACAA GAGTCCACTA 
5741 TTAAAGAACG TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG CGATCGCCCA CTACGTGAAC 
5811 CATCACCCTA ATCAAGTTTT TTCGGGTCGA GGTGCCGTAA AGCACTAAAT CGGAACCCTA AAGGGACCCC
5881 CCGATTTAGA GCTTGACGCG GAAAGCCGGC GAACGTGGCG AGAAAGGAAG GGAAGAAAGC GAAACGAGCG
5951 GGCGCTACGG CCCTCCCAAG TGTACCCGTC ACGCTGCGCG TAACCACCAC ACCCGCCCCG CTTAATGCGC
6021 CGCTACAGGG CGCGTCCCAG GTGGCACTTT TCGCGGAAAT GTCCGCGGAA CCCCTATTTG TTTATTTTTC 
6091 TAAATACATT CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTCAAAAA 
6161 GCAACACTAT GAGTATTCAA CATTTCCCTG TCGCCCTTAT TCCCTTTTTT CCGGCATTTT GCCTTCCTCT 
6231 TTTTCCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC 
6301 ATCCAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT CCAATGATGA 
6371 CCACTTTTAA AGTTCTGCTA TCTGGCGCGG TATTATCCCG TATTGACCCC GGGGAAGAGC AACTCGGTCC 
6441 CCCCATACAC TATTCTCAGA ATGACTTCCT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC 
6511 ATCACACTAA CAGAATTATG CAGTGCTGCC ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA 
6581 CAACGATCGG AGGACCGAAG CAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATCTAA CTCGCCTTCA 
6651 TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GACCGTCACA CCACGATGCC TGTAGCAATG 
6721 GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA TTAATAGACT 
6791 CGATGGAGGC CGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA 
6861 TAAATCTGGA CCCCCTGACC GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC 
6931 CGTATCGTAG TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCCCTGAGA 
7001 TAGGTGCCTC ACTGATTAAG CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT 
7071 AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ACTTCATCAC CAAAATCCCT 
7141 TAACGTGAGT TTTCCTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT TGAGATCCTT
7211 TTTTTCTGCC CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA 
7281 TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT 
7351 CTAGTGTAGC CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
7421 TCCTCTTACC AGTGGCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT 
7491 ACCGGATAAG GCCCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA CCGAACGACC 
7561 TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG AGAAAGGGGG 
7631 ACAGGTATCC GCTAAGCGCC AGGGTCGGAA CAGGAGAGCG CAGGAGGGAG CTTCCAGGGG GAAACGCCTG 
7701 GTATCTTTAT AGTCCTGTCG CGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG 
7771 GGGCGGAGCC TATGGAAAAA CGCGAGCAAC CCCGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG
7841 CTCACATGTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG AGTGAGCTGA 
7911 TACCGCTCCC CGCAGCCGAA CGACCGAGGG CAGCGAGTCA GTGAGCGAGG AAGGGGAAGA GCGCCCAATA 
7981 CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT GCAGCTGGCA CGACAGGTTT CCCGACTGGA 
8051 AACCCGGCAG TGAGCGCAAC GCAATTAATG TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT 
8121 TATGCTTCCG GCTCCTATGT TGTGTGGAAT TCTGAGCGGA TAACAATTTC ACACAGGAAA CACCTATGAC 
8191 CATGATTACG CCAAGCTCGG AATTAACCCT CACTAAAGGG AACAAAAGCT GCTGCAGGGT CCCTAACTGC 
8261 CAAGCCCCAC AGTCTGCCCT GAGGCTGCCC CTTCCTTCTA GCGGCTGCCC CCACTCGGCT TTGCTTTCCC 
8331 TAGTTTCAGT TACTTGCGTT CACCCAAGCT CTGAAACTAG GTGCGCACAG AGCGGTAAGA CTGCGAGAGA 
8401 AAGAGACCAG CTTTACACGG GGTTTATCAC AGTGCACCCT GACAGTCCTC AGCCTCACAG GGGGTTTATC 
8471 ACATTGCACC CTCACAGTCG TCAGCCTCAC AGGGGGTTTA TCACAGTCCA CCCTTACAAT CATTCCATTT 
8541 GATTCACAAT TTTTTTAGTC TCTACTGTGC CTAACTTGTA AGTTAAATTT GATCAGAGGT GTGTTCCCAG 
8611 AGGGGAAAAC AGTATATACA GGGTTCAGTA CTATCGCATT TCAGGCCTCC ACCTCGGTCT TGGAATGTGT 
8681 CCCCCGAGGG GTGATGACTA CCTCACTTGG ATCTCCACAG GTCACAGTGA CACAAGATAA CCAAGACACC 
8751 TCCCAAGGCT ACCACAATGG GGCGCCCTCC ACGTGCACAT GGCCGGAGGA ACTGCCATGT CGGAGGTGCA 
8821 AGCACACCTG CCCATCAGAC TCCTTGGTGT GGAGGGAGGG ACCAGCGCAG CTTCCAGCCA TCCACCTGAT 
8891 GAACAGAACC TAGGGAAAGC CGCAGTTCTA CTTACACCAG GAAAGGC (SEQUENCE ID NO: 8)



FIG. 10E 






1 TGGAAGGGCT AATTTGGTCC CAAAAAAGAC AAGAGATCCT TGATCTGTGG ATCTACCACA CACAAGGCTA



71 CTTCCCTGAT TGGCAGAACT ACACACCAGG GCCAGGGATC AGATATCCAC TGACCTTTGG ATGGTGCTTC

141 AACTTACTAC CAGTTGAACC AGAGCAAGTA GAAGAGGCCA AATAAGGAGA GAAGAACAGC TTCTTACACC

211 CTATGAGCCA GCATGGGATG GAGGACCCGG AGGGAGAAGT ATTAGTGTGG AAGTTTGACA GCCTCCTAGC

281 ATTTCGTCAC ATGGCCCGAG AGCTCCATCC CGAGTACTAC AAAGACTGCT GACATCGAGC TTTCTACAAG

351 GGACTTTCCG CTGGGGACTT TCCAGGGAGG TGTGGCCTGG GCGGGACTGG GGAGTGGCGA GCCCTCAGAT

421 GCTACATATA AGCAGCTGCT TTTTGCCTCT ACTCGGTCTC TCTGCTTAGA CCAGATCTGA GCCTGGGACC

491 TCTCTGGCTA ACTAGGGAAC CCACTGCTTA AGCCTCAATA AAGCTTGCCT TGAGTGCTCA AAGTAGTGTG

561 TGCCCGTCTG TTGTGTGACT CTGGTAACTA GAGATCCCTC AGACCCTTTT AGTCAGTGTG GAAAATCTCT



631 AGCAGTGGCG CCCGAACAGG GACTTGAAAG CGAAAGTAAA GCCAGAGGAG ATCTCTCGAC GCAGGACTCG 



BssHII (711) 

701 GCTTGCTGAA GCGCGCacgg caagaggcga ggggcggcgC ctgACgagGa cgccaaaaat tttgactagc 



ClaI (830)

771 ggaggctaga aggagagagC TCGGTGCGAG AGCGTCAGTA TTAAGCCGGG GAGAATTAGA TCGATGGGAA 



841 AAAATTCGGT TAAGGCCAGG GGGAAAGAAG AAGTACAAGC TAAAGCACAT CGTATGGGCA AGCAGGGAGC

AccI (959)

911 TAGAACGATT CGCAGTTAAT CCTGGCCTGT TAGAAACATC AGAAGGCTGT AGACAAATAC TCGCACAGCT

981 ACAACCATCC CTTCAGACAG GATCAGAGGA GCTTCGATCA CTATACAACA CAGTAGCAAC CCTCTATTGT

1051 GTGCACCAGC GGATCGAGAT CAAGGACACC AAGGAAGCTT TAGACAAGAT ACACCAAGAG CAAAACAAGT

1121 CCAAGAAGAA GGCCCAGCAG GCAGCAGCTG ACACAGGACA CAGCAATCAG GTCAGCCAAA ATTACCCTAT

1191 ACTCCACAAC ATCCACGGGC AAATCGTACA TCAGGCCATA TCACCTAGAA CTTTAAACCA TAAGCTTGGG

1261 AGTTCCGCGT TACATAACTT ACGGTAAATG GCCCGCCTGG CTGACCGCCC AACGACCCCC GCCCATTCAC

1331 GTCAATAATG ACGTATGTTC CCATAGTAAC CCCAATAGCC ACTTTCCATT GACGTCAATG GGTGGAGTAT

1401 TTACGGTAAA CTCCCCACTT GGCAGTACAT CAAGTGTATC ATATGCCAAG TACGCCCCCT ATTGACGTCA

1471 ATGACGGTAA ATGGCCCGCC TGGCATTATG CCCAGTACAT GACCTTATGG GACTTTCCTA CTTGCCAGTA

1541 CATCTACGTA TTAGTCATCG CTATTACCAT GGTGATGCGC TTTTGCCAGT ACATCAATGG GCGTGGATAG

1611 CGGTTTGACT CACGGGGATT TCCAAGTCTC CACCCCATTG ACGTCAATGG GAGTTTGTTT TGGCACCAAA

1681 ATCAACGGGA CTTTCCAAAA TGTCGTAACA ACTCCGCCCC ATTGACGCAA ATGGGCGGTA GGCGTGTACG

1751 GTGGGAGGTC TATATAAGCA GAGCTCGTTT AGTGAACCGT CAGATCGCCT GGAGACGCCA TCCACGCTGT

1821 TTTGACCTCC ATAGAAGACA CCGACTCTAG AGgatccATC TAAGTAAGCT TGGCATTCCG GTACTGTTGG



1891 TAAAATGGAA GACGCCAAAA ACATAAAGAA AGGCCCGGCG CCATTCTATC CTCTAGAGGA TGGAACCGCT 



1961 GGAGAGCAAC TGCATAAGGC TATGAAGAGA TACGCCCTGG TTCCTGCAAC AATTGCTTTT ACAGATGCAC 



2031 ATATCGAGCT GAACATCACG TACGCGGAAT ACTTCGAAAT GTCCGTTCGC TTGGCAGAAG CTATGAAACG 



2101 ATATGGGCTG AATACAAATC ACAGAATCGT CGTATGCAGT GAAAACTCTC TTCAATTCTT TATGCCGCTG 



2171 TTGGGCGCGT TATTTATCGG AGTTGCAGTT GCGCCCGCGA ACGACATTTA TAATGAACGT GAATTGCTCA 



2241 ACAGTATGAA CATTTCGCAG CCTACCGTAG TGTTTGTTTC CAAAAAGGGG TTGCAAAAAA TTTTGAACCT 



2311 GCAAAAAAAA TTACCAATAA TCCAGAAAAT TATTATCATG GATTCTAAAA CGGATTACCA GGGATTTCAG 



2381 TCGATGTACA CCTTCCTCAC ATCTCATCTA CCTCCCGGTT TTAATGAATA CGATTTTGTA CCAGAGTCCT

2451 TTGATCGTGA CAAAACAATT CCACTGATAA TGAATTCCTC TGGATCTACT GGGTTACCTA AGGGTGTGGC



2521 CCTTCCGCAT AGAACTCCCT GCGTCACATT CTCGCATGCC ACAGATCCTA TTTTTGGCAA TCAAATCATT 



2591 CCGGATACTG CGATTTTAAG TGTTGTTCCA TTCCATCACG GTTTTGGAAT GTTTACTACA CTCGGATATT

2661 TGATATGTGG ATTTCGAGTC GTCTTAATGT ATAGATTTGA AGAAGAGCTG TTTTTACGAT CCCTTCAGGA

2731 TTACAAAATT CAAAGTGCGT TCCTAGTACC AACCCTATTT TCATTCTTCG CCAAAAGCAC TCTGATTGAC

2801 AAATACGATT TATCTAATTT ACACGAAATT GCTTCTGGGG CCCCACCTCT TTCGAAAGAA GTCGGGGAAG

2871 CGCTTCCAAA ACGCTTCCAT CTTCCAGGGA TACGACAAGG ATATGGGCTC ACTGAGACTA CATCAGCTAT

2941 TCTGATTACA CCCGAGGGGG ATGATAAACC GGGCGCGGTC GGTAAAGTTG TTCCATTTTT TGAAGCGAAG

3011 GTTCTGGATC TGGATACCGG GAAAACGCTG GCCGTTAATC AGAGAGCCGA ATTATGTGTC AGAGGACCTA

3081 TGATTATGTC CGGTTATGTA AACAATCCGG AAGCGACCAA CGCCTTGATT GACAAGGATG GATGGCTACA

3151 TTCTGGAGAC ATAGCTTACT GGGACGAAGA CGAACACTTC TTCATAGTTG ACCGCTTGAA GTCTTTAATT

ClaI (3259)

3221 AAATACAAAG GATATCAGGT GGCCCCCGCT GAATTGGAAT CGATATTGTT ACAACACCCC AACATCTTCG

3291 ACGCGGGCGT GCCACGTCTT CCCGACGATG ACGCCGGTGA ACTTCCCGCC GCCGTTGTTG TTTTGGAGCA

3361 CGGAAAGACG ATGACGGAAA AAGAGATCGT GGATTACGTC GCCAGTCAAG TAACAACCGC GAAAAACTTG

3431 CGCGGAGGAG TTGTGTTTGT GCACGAACTA CCGAAAGGTC TTACCGGAAA ACTCGACGCA AGAAAAATCA



ApaI (3557)
XhoI (3548) KpnI (3563)

3501 GAGAGATCCT CATAAAGGCC AAGAAGCGCG GAAAGTCCAA ATTGTAAcTC GAGCCCCGGC CCGGTACCTT 



3571 TAAGACCAAT GACTTACAAG GCAGCTGTAG ATCTTAGCCA CTTTTTAAAA GAAAAGGGGG GACTCGAACC







3641 CCTAATTCAC TCCCAAACAA GACAACATAT CCTTCATCTG TGGATCTACC ACACACAAGG CTACTTCCCT 



3711 CATTCGCACA ACTACACACC AGGGCCAGGG GTCAGATATC CACTGACCTT TGGATGGTGC TACAAGCTAG

3781 TACCAGTTCA GCCAGATAAG GTAGAAGAGG CCAATAAAGG AGAGAACACC AGCTTGTTAC ACCCTGTGAG

3851 CCTGCATGGA ATGGATGACC CTGAGAGAGA AGTGTTAGAG TGGAGGTTTG ACAGCCGCCT AGCATTTCAT

3921 CACGTGGCCC GAGAGCTGCA TCCGGAGTAC TTCAACAACT GCTGACATCG AGCTTGCTAC AACCGACTTT

3991 CCGCTGGGGA CTTTCCACGG AGGCGTGGCC TGGGCGGGAC TGGGGACTGG CGAGCCCTCA GATCCTCCAT

4061 ATAAGCAGCT GCTTTTTGCC TGTACTGGGT CTCTCTGGTT AGACCAGATC TCAGCCTGGG AGCTCTCTGG

4131 CTAACTAGGG AACCCACTGC TTAAGCCTCA ATAAAGCTTG CCTTGACTCC TTCAAGTAGT GTGTGCCCGT

4201 CTGTTGTGTG ACTCTGGTAA CTAGAGATCC CTCAGACCCT TTTAGTCAGT GTGGAAAATC TCTAGCACCC



4271 CCCAGGAGGT ACACGTTGCA GTGAGCCAAG ATCGCGCCAC TGCATTCCAG CCTGGGCAAG AAAACAAGAC

4341 TGTCTAAAAT AATAATAATA AGTTAAGGCT ATTAAATATA TTTATACATG GAGGTCATAA AAATATATAT 

4411 ATTTGGGCTG GGCGCAGTGG CTCACACCTG CGCCCGGCCC TTTGGGAGGC CGAGGCAGGT GGATCACCTG 

4481 AGTTTGCCAG TTCCAGACGA GCCTGACCAA CATGGAGAAA CCCCTTCTCT GTGTATTTTT AGTAGATTTT 

4551 ATTTTATGTG TATTTTATTC ACAGGTATTT CTGGAAAACT GAAACTGTTT TTCCTCTACT CTGATACCAC

4621 AAGAATCATC AGCACAGAGG AAGACTTCTG TGATCAAATG TGGTGGGAGA GGGAGGTTTT CACCAGCACA 

4691 TGAGCAGTCA GTTCTGCCGC AGACTCGGCG GGTGTCCTTC GCTTCAGTTC CAACACCGCC TGCCTGGAGA 

4761 GAGGTCAGAC CACAGGGTGA GGGCTCACTC CCCAACACAT AAACACCCAA GACATAAACA CCCAACAGGT 

4831 CCACCCCGCC TGCTGCCCAG CCAGAGCCGA TTCACCAAGA CGGGAATTAG GATAGAGAAA GAGTAAGTCA 

4901 CACAGAGCCC GCTGTGCGGG AGAACGGAGT TCTATTATGA CTCAAATCAG TCTCCCCAAG CATTCCGGGA 

4971 TCAGAGTTTT TAAGGATAAC TTAGTGTGTA GGGGGCCAGT GAGTTGGAGA TGAAAGCGTA GGGAGTCCAA 

5041 GCTGTCCTTT TGCGCCGAGT CAGTTCCTGG CTGGGGCCCA CAAGATCGGA TGAGCCAGTT TATCAATCCG 

5111 CGCCTCCCAG CTCATCCATC GACTGCAGGG TCTGCAAAAT ATCTCAACCA CTGATTGATC TTAGGTTTTA 

5181 CAATAGTGAT GTTACCCCAG CAACAATTTG CGGAAGGTCA GAATCTTGTA GCCTGTAGCT GCATGACTCC 

5251 TAAACCATAA TTTCTTTTTT CTTTTTTTTT TTTTATTTTT GAGACAGGGT CTCACTCTGT CACCTAGGCT

5321 GGACTGCAGT GGTGCAATCA CAGCTCACTG CAGCCCCTAG AGCGGCCGCC ACCGCGGTGG AGCTCCAATT 

5391 CCCCCTATAG TGAGTCCTAT TACAATTCAC TGGCCGTCGT TTTACAACGT CGTGACTGGG AAAACCCTGG 

5461 CGTTACCCAA CTTAATCGCC TTGCACCACA TCCCCCTTTC GCCAGCTGGC GTAATAGCGA AGAGGCCCGC 

5531 ACCGATCGCC CTTCCCAACA GTTGCCCAGC CTGAATCGCG AATGGCGCGA AATTGTAAAC GTTAATATTT 

5601 TCTTAAAATT CGCGTTAAAT TTTTGTTAAA TCAGCTCATT TTTTAACCAA TAGGCCGAAA TCGGCAAAAT 

5671 CCCTTATAAA TCAAAAGAAT AGACCCACAT AGGGTTGAGT GTTCTTCCAG TTTGGAACAA GAGTCCACTA 

5741 TTAAAGAACG TGGACTCCAA CGTCAAAGGG CGAAAAACCG TCTATCAGGG CGATCGCCCA CTACGTGAAC 

5811 CATCACCCTA ATCAAGTTTT TTGGGGTCGA GGTGCCGTAA AGCACTAAAT CGGAACCCTA AAGGGAGCCC 

5881 CCGATTTAGA GCTTGACGGG GAAAGCCGGC CAACGTGGCC AGAAAGGAAG GGAAGAAAGG GAAAGGAGGG 

5951 GGCGCTAGGG CCCTGGCAAG TGTAGCGCTC ACGCTGCGCG TAACCACCAC ACCCGCCGCG CTTAATGCGC 

6021 CGCTACAGCC CCCGTCCCAG CTGGCACTTT TCGGGGAAAT GTCCGCGGAA CCCCTATTTG TTTATTTTTC

6091 TAAATACATT CAAATATGTA TCCGCTCATG AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA
6161 CCAAGACTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT
6231 TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGCGTGCACG AGTGGGTTAC 
6301 ATCGAACTGG ATCTCAACAG CGCTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT CCAATGATGA 
6371 GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC GGGCAAGAGC AACTCGGTCG 
6441 CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA CCAGTCACAG AAAAGCATCT TACGCATGGC
6511 ATGACAGTAA GAGAATTATG CAGTGCTGCC ATAACCATCA GTGATAACAC TGCGGCCAAC TTACTTCTCA 
6581 CAACGATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA 
6651 TCGTTGGGAA CCGGAGCTGA ATCAAGCCAT ACCAAACCAC GAGCGTGACA CCACGATGCC TGTAGCAATG 
6721 GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTACCTTC CCGGCAACAA TTAATAGACT 
6791 GGATCCAGGC GGATAAAGTT GCACGACCAC TTCTGCGCTC GGCCCTTCCG GCTGGCTGGT TTATTGCTGA
6861 TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT GCAGCACTGG GGCCAGATGG TAAGCCCTCC 
6931 CGTATCCTAG TTATCTACAC GACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA 
7001 TAGGTGCCTC ACTGATTAAG CATTCGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT 
7071 AAAACTTCAT TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT 
7141 TAACGTGAGT TTTCCTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT TGAGATCCTT 
7211 TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA GCGGTGGTTT GTTTGCCGGA 
7281 TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC AGCAGAGCGC AGATACCAAA TACTGTCCTT 
7351 CTAGTGTAGC CGTAGTTAGG CCACCACTTC AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA 
7421 TCCTGTTACC AGTGCCTGCT GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT 
7491 ACCGGATAAG GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC 
7561 TACACCGAAC TGAGATACCT ACAGCGTGAG CTATCAGAAA GCGCCACGCT TCCCGAAGGG AGAAAGGCGG 
7631 ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG CTTCCAGGGG GAAACGCCTG 
7701 GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT GAGCGTCGAT TTTTGTGATG CTCGTCAGGG 
7771 CCGCCGACCC TATGGAAAAA CGCCAGCAAC GCGGCCTTTT TACGGTTCCT GGCCTTTTGC TGGCCTTTTG 
7841 CTCACATCTT CTTTCCTGCG TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG ACTGAGCTGA 
7911 TACCGCTCGC CGCAGCCGAA CGACCGAGCG CACCGAGTCA GTGAGCGAGG AAGCGGAAGA GCGCCCAATA 
7981 CGCAAACCGC CTCTCCCCGC GCGTTGGCCG ATTCATTAAT GCAGCTGGCA CGACAGGTTT CCCGACTGGA 
8051 AAGCGGGCAG TGAGCCCAAC GCAATTAATG TGAGTTAGCT CACTCATTAG GCACCCCAGG CTTTACACTT
8121 TATGCTTCCG GCTCGTATGT TGTGTGGAAT TGTCAGCGGA TAACAATTTC ACACAGGAAA CAGCTATGAC 
8191 CATGATTACG CCAACCTCCG AATTAACCCT CACTAAAGGG AACAAAAGCT GCTGCAGGGT CCCTAACTGC 
8261 CAAGCCCCAC AGTCTGCCCT CAGGCTGCCC CTTCCTTCTA GCGGCTGCCC CCACTCGGCT TTGCTTTCCC 
8331 TAGTTTCAGT TACTTGCGTT CAGCCAAGGT CTGAAACTAG GTCCGCACAG AGCGGTAAGA CTGCGAGAGA
8401 AAGAGACCAG CTTTACAGGG GGTTTATCAC AGTGCACCCT GACAGTCGTC AGCCTCACAG GGGGTTTATC 
8471 ACATTCCACC CTGACAGTCG TCAGCCTCAC AGGGGGTTTA TCACAGTGCA CCCTTACAAT CATTCCATTT 
8541 GATTCACAAT TTTTTTAGTC TCTACTCTGC CTAACTTGTA AGTTAAATTT GATCAGAGGT GTGTTCCCAG 
8611 AGGGGAAAAC AGTATATACA GGGTTCAGTA CTATCGCATT TCAGCCCTCC ACCTGGGTCT TGGAATGTGT 
8681 CCCCCGAGGG CTGATGACTA CCTCAGTTGG ATCTCCACAG CTCACAGTGA CACAAGATAA CCAAGACACC 
8751 TCCCAAGGCT ACCACAATGG GCCGCCCTCC ACGTGCACAT GGCCGGAGGA ACTGCCATGT CGGAGGTGCA
8821 AGCACACCTG CGCATCAGAG TCCTTGGTGT GGAGGGAGGG ACCAGCGCAG CTTCCAGCCA TCCACCTGAT 
8891 GAACAGAACC TAGGGAAAGC CCCAGTTCTA CTTACACCAG GAAAGGC (SEQUENCE ID NO: 9)

BC/HXB2

BC/NL43

#1

CGCGCACGGC AAGAGGCGAG GGGCGGCGAC TGGTGAGTAC GCCAAAAATT

mBCwCN frag C- C

BC/HXB2 T-

BC/NL43 G

HS1

TTGACTAGCG GAGGCTAGAA GGAGAGAGAT GGGTGCGAGA GCGTCAGTAT

Gag production from the Rev-independent gag-pol HIV-1 vector pCMVBNkon

Transduction of lymphoid cells (Jurkat)

Rev-dependent mGag-Pol

FIG. 15C

FIG. 15D

BsrGI (37)

1 CCTGGCCATTCCATACGTTGTATCCATATCATAATATGTA^ 
81 CATTGATTATTGACTAGTTATTAATAGTAATCAATTA 
161 ATAACTTACGGTAAATGGCCCGCCTGGCT^ 
241 TAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTA 
321 GTGTATCATATGCX^AAGTACGCCCCCTATTGACGTCAATGACGGTAAATQ 

SnaBI (432) 

401 CTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTA 
481 TCAATCGGCGTGGATAGCGGTTTX^CrcACGGGGATTTOCAA 
561 CACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTC^ 
641 GGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTC 

SacII (746)

721 GAAGACACCGGGACCGATCCAGCCTCCGCGGGCCGCGCT^ 

1>MetGlyCysLeuGlyAsnGlnLeuLeuIleAlaIleL

801 TGCTTTTAAGTGTCTATGGGATCTATTGTACTCTA 

13>euLeuLeuSerValTyrGlyIleTyrCysThrLeuTyrValThrValPheTyrGlyValProAlaTrpArgAsnAlaThr

881 ATTCCCCTCTTTTGTGCAA(XAAG^ 

40>IleProLeuPheCysAlaThrLysAsnArgAspThrTrpGlyThrThrGlnCysLeuProAspAsnGlyAspTyrSerGl

961 AGTGGCCCTTAATGTTACAGAAAGCTT^ 

66>uValAlaLeuAsnValThrGluSerPheAspAlaTrpAsnAsnThrValThrGluGlnAlaIleGluAspValTrpGlnL

1041 TCTTTGAGACCTCAATAAAGCCTTGTGTAAAATC 

93>euPheGluThrSerIleLysProCysValLysLeuSerProLeuCysIleThrMetArgCysAsnLysSerGluThrAsp

1121 AGATGGGGATTGACAAAATCAATAACAACAACAGCATCAACAACATC 

120>ArgTrpGlyLeuThrLysSerIleThrThrThrAlaSerThrThrSerThrThrAlaSerAlaLysValAspMetValAs

1201 TGAGACTAGTTCTTGTATAQCCCAGGATAATTGCAC& 

146>nGluThrSerSerCysIleAlaGlnAspAsnCysThrGlyLeuGluGlnGluGlnMetIleSerCysLysPheAsnMetT

PstI (1329)

1281 CAGGGTTAAAAAGAGACAAGAAAAAAGAGTACAATGAAACTTGGTACTC 

173>hrGlyLeuLysArgAspLysLysLysGluTyrAsnGluThrTrpTyrSerAlaAspLeuValCysGluGlnGlyAsnAsn

1361 ACTGGTAATGAAAGTAGATGTTAC^TGAACCACTGTAACACT 

200>ThrGlyAsnGluSerArgCysTyrMetAsnHisCysAsnThrSerValIleGlnGluSerCysAspLysHisTyrTrpAs
1441 TGCTATTAGATTTAGGTATTGTGCACCTCCAGGTTATGCT^ 

226>pAlaIleArgPheArgTyrCysAlaProProGlyTyrAlaLeuLeuArgCysAsnAspThrAsnTyrSerGlyPheMetP

253>roLysCysSerLysValValValSerSerCysThrArgMetMetGluThrGlnThrSerThrTrpPheGlyPheAsnGly

1601 ACTAGAGCAGAAAATAGAACTTATATTTACTGGCATGGTAGGGATAATA 

280>ThrArgAlaGluAsnArgThrTyrIleTyrTrpHisGlyArgAspAsnArgThrIleIleSerLeuAsnLysTyrTyrAs

1681 TCTAACAATCAAATCTASAAGACC^GGAAATAA^ 

306>nLeuThrMetLysCysArgArgProGlyAsnLysThrValLeuProValThrIleMetSerGlyLeuValPheHisSerG
XcmI (1778)

1761 AACCAATCAATGATAGGCCAAAGCAGGCATGGTGTT 

333>lnProIleAsnAspArgProLysGlnAlaTrpCysTrpPheGlyGlyLysTrpLysAspAlaIleLysGluValLysGln
1841 ACGATTGTCAAACATCCCAGGTATACnxSGAACr 

360>ThrIleValLysHisProArgTyrThrGlyThrAsnAsnThrAspLysIleAsnLeuThrAlaProGlyGlyGlyAspPr

1921 GGAAGTTACCTTCATGTGGACAAATTGCAGAGGAGAGTTCCTCT 

386>oGluValThrPheMetTrpThrAsnCysArgGlyGluPheLeuTyrCysLysMetAsnTrpPheLeuAsnTrpValGluA

2001 ATAGGAATACAGCTAACCAGAAGCCAAAGGAACAGCATAAAAG 

413>spArgAsnThrAlaAsnGlnLysProLysGluGlnHisLysArgAsnTyrValProCysHisIleArgGlnIleIleAsn
PmlI (2134)

2081 ACTTGGCATAAAGTAGGGAAAAATGTTTATTTGCCTC^ 

440>ThrTrpHisLysValGlyLysAsnValTyrLeuProProArgGluGlyAspLeuThrCysAsnSerThrValThrSerLe
2161 CATAGCAAACATAGATTGGATTX^TGGAAACCAAACTAATATCACCATGA 

466>uIleAlaAsnIleAspTrpIleAspGlyAsnGlnThrAsnIleThrMetSerAlaGluValAlaGluLeuTyrArgLeuG
2241 AATTGGGAGATTATAAATTAGTAGAGATCACTCCAATTGGCTTGGCCCCCACAGATGTGAAGAGGTACACTACTGGTGGC

493>luLeuGlyAspTyrLysLeuValGluIleThrProIleGlyLeuAlaProThrAspValLysArgTyrThrThrGlyGly

BspMI (2378) 

2321 ACCTCAAGAAATAAAAGAGGGGTCTTTGTGCTAGGGTTCT^ 

520>ThrSerArgAsnLysArgGlyValPheValLeuGlyPheLeuGlyPheLeuAlaThrAlaGlySerAlaMetGlyAlaAl
2401 CAGCCTGACCCTCACGGCACAGTCTCGAACTTTATTGGCT 

546>aSerLeuThrLeuThrAlaGlnSerArgThrLeuLeuAlaGlyIleValGlnGlnGlnGlnGlnLeuLeuAspValValL

Eam1105I (2502)

2481 AGAGACAACAAGAATTGTTGCGACTGACCGTCTGGGGAACAAAGAACCTCC^ 

573>ysArgGlnGlnGluLeuLeuArgLeuThrValTrpGlyThrLysAsnLeuGlnThrArgValThrAlaIleGluLysTyr
2561 TTAAAGGACCAGGCGCAGCTX^KOTKXX^ 

600>LeuLysAspGlnAlaGlnLeuAsnAlaTrpGlyCysAlaPheArgGlnValCysHisThrThrValProTrpProAsnAl
2641 AAGTCTAACACCAAAGTGGAACAATGAGACTTGGCAAGAGTGGGAGCGA^ 

626>aSerLeuThrProLysTrpAsnAsnGluThrTrpGlnGluTrpGluArgLysValAspPheLeuGluGluAsnIleThrA

2721 CCC TC CT AGAGGAGGCACAAATTCAAC AAGAGAAGAACATGT ATGAATC TTGAAT AGCTGGGATGTGTTTGGC 

653>laLeuLeuGluGluAlaGlnIleGlnGlnGluLysAsnMetTyrGluLeuGlnLysLeuAsnSerTrpAspValPheGly
2801 AATTGGTTTGACCTTGCTTCTTGGATAAAGTATATACA^ 

680>AsnTrpPheAspLeuAlaSerTrpIleLysTyrIleGlnTyrGlyValTyrIleValValGlyValIleLeuLeuArgIl
2881 AGTGATCTATATAGTACAAATGCTAGCTAAGTTAAGGCAGGGGTATAGGCC 

706>eValIleTyrIleValGlnMetLeuAlaLysLeuArgGlnGlyTyrArgProValPheSerSerProProSerTyrPheG
PpuMI (2979) 
2961 AGCAGACCCATATCCAAC^GGACCCGGC^^ 

733>lnGlnThrHisIleGlnGlnAspProAlaLeuProThrArgGluGlyLysGluArgAspGlyGlyGluGlyGlyGlyAsn
3041 AGCTCCTGGCCTTGGCAGATAGAATATATCCAC ITTCT l'ATTCGTCAGCTTATTAGACTCTTGACTTGGCTATTCAGT^ 

760>SerSerTrpProTrpGlnIleGluTyrIleHisPheLeuIleArgGlnLeuIleArgLeuLeuThrTrpLeuPheSerAs
3121 CTGTAGGACTTTX^ATCGAGAGTATACCAGATCXnX 

786>nCysArgThrLeuLeuSerArgValTyrGlnIleLeuGlnProIleLeuGlnArgLeuSerAlaThrLeuGlnArgIleA
Bsu36I (3208)

3201 GAGAAGTCCTCAGGACTGAACIX^CCTACCTACAATATGGGTG^ 

813>rgGluValLeuArgThrGluLeuThrTyrLeuGlnTyrGlyTrpSerTyrPheHisGluAlaValGlnAlaValTrpArg
3281 TCTGCGACAGAGACTCTTGCGGGCGCGTG^ 

840>SerAlaThrGluThrLeuAlaGlyAlaTrpGlyAspLeuTrpGluThrLeuArgArgGlyGlyArgTrpIleLeuAlaIl

BamHI (3418) 
EcoRI (3412) 

3361 CCCCAGGAGGATTAGACAAGGGCTTGAGCTCACrc tCtagaCTOGA 

866>eProArgArgIleArgGlnGlyLeuGluLeuThrLeuLeu* • •
Eco47III (3457)
3441 GGGGGGGCCCGGTACGAGOXITAGCTAGCTAGAa^ 



3521 AGAGTGAC ATTTTTCACTAACCT AAGACAGGAGGGCCXsTCAGAGC^ TAATCCAAAGACGGGTAAAAGTGATAAA 

BstEII (3673) 

3601 AATGTATCACTCCAACCTAAGACAGGCGCAGCTTtXGAGGGATTTC
BsaBI (3740)

3681 GTCCGGAGCCGTGCTGCCCGGATGATGTCTTGGTCTAGACT^ 



3761 CTAGTTGCCAGCCATCTGTTGTTTGCCCCT^ 

3841 TAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTC 

SphI (3948) KpnI (3976)

3921 GGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGGTA 

BstXI (4060) 

4001 CCGGTTCCTCCTGGGCCAGAAAGAAGCAGGCACATCCCCTO 

4081 CCAGCCCCACTCATAGGACACTCATAGCTCAGGAGGGCTCCGOCTTCAATCCCACCC^ 
4161 TCCCTCCCTCATCAGCCCACCAAACCAAACCTAGCCTCCAAGAGTGGGAAGAAATTAA^ 

XmnI (4293)

4241 AGAGGGAGAGAAAATGCCTCCAACATGTGAGGAAGTAATGAGAGAAATCATAGAATTTCTTC^ 

► 

4321 CTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGC 

4401 GGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCA^ 

4481 TTCCATAGGCTCCGCCCCCXTGA^ 

4561 AAGATACCAGGCGTTTCCCCCTGGAAGCTCCCT 

4641 CCTTTCTCOCTTCGGGAAGOGTX3GOGCTTTC 

4801 GGTAAGACACGACTTATaXX^CTGGCAGCAGCCA 

4881 AGTTCTTGAAGTGGTGGCCTAACTACGGCTACA 

4961 TTCGGAAAAAGAGTIX3GTAGCTCTTGAItXX3GCAAACAA^ 

5041 GATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTT^ 

5121 CACGTTAAGGGAITTTGGTCATGAGAl^^ 

5201 TCAATCTAAAGTATATATGAGTAAACTTGGTCTGACA^ 

5281 TCTATTTCGTTCATCCATAGTTGCCT^ 

StuI (5368)

5441 CCAGTTGGTGATTTTXGtf^CTTTTGCT 

5521 CAGCAAAAGTTCGATTTATTCAACAAAGCCGCCGTCCC^ 

5601 CCAATTCTGATTAGAAAAACTCATCGAGCATCAAATGAAACT^ 

271*PhePheGluAspLeuMetLeuHisPheGlnLeuLysAsnMetAspProAsnAspIleGlyTyrLysG
5681 TGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCA^ 
248*lnPheLeuArgLysGlnLeuSerProSerPheGluGlyLeuCysAsnTrpLeuIleAlaLeuAspGlnTyrArgAspAla

222*IleGlyValArgGlyValAspIleCysGlyIleLeuLysGlyGluAspPheIleLeuAsnAspLeuSerPheAspGlyHi
5841 GAGTGACGACTGAATCCGGTGAGAATGGCAAAAGC^ 
195*sThrValValSerAspProSerPheProLeuLeuLysHisMetGluLysTrpValGlnGluValProTrpGlyAsnArgG

PvuI (5993)
SgfI (5992)

5921 TCGTCATCAAAATCACTCGCATCAACCAAACCXnTA 
168*luAspAspPheAspSerAlaAspValLeuGlyAsnAsnMetArgSerGlnAlaGlnAlaLeuArgPheValArgAspSer

BsrFI (6036) SspI (6067)

6001 GTTAAAAGGACAATTACAAACAGGAATCGAATGCAArc 
142*AsnPheProCysAsnCysValProIleSerHisLeuArgArgLeuPheValAlaLeuAlaAspValIleAsnGluGlySe

SmaI (6118)

6081 AATCAGGATATTCTTCTAATACCTGGAATGCTGTT^ 

115*rAspProTyrGluGluLeuValGlnPheAlaThrLysGlyProIleAlaThrThrLeuLeuTrpAlaAspAspProThrA
6161 CGGATAAAATGCTTGATGGTCGGAAGAGGCATAAA 

88*rgIlePheHisLysIleThrProLeuProMetPheGluThrLeuTrpAsnLeuArgValMetGluAspThrValAspAsn
6241 GGCAACGCTACCTTTGCCATGTTTCAGAAACAACT 

62*AlaValSerGlyLysGlyHisLysLeuPheLeuGluProAlaAspProLysGlyTyrLeuArgTyrIleThrAlaGlySe
NruI (6335)
6321 ATTGCCCGACATTATCGCGAGCXrATTTA^ 

35*rGlnGlyValAsnAspArgAlaTrpLysTyrGlyTyrLeuAspAlaAspMetAsnSerAsnLeuArgProArgSerCysS
6401 GACGTTTCCCGTTGAATATGGCTCATAACACTC 
8*rThrGluArgGlnIleHisSerMet

DraIII (6523)

6481 TATATTTTTATCTTGTGCAATGTAACATCAGAGATTTT^ 
6561 TTTATCAGGGTTATTGTCTCATGAGCX^TACATATTT
6641 tttccccgaaaagtgccacctgacgtc

6721 GCCCTTTCGTCTCGCGCGriTCGGTGAT^ 

6801 TGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGC^GCGTCAGCGGG 

6881 GCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCG

6961 CATCAGATTGGCTATTGG 



